我们介绍了认知神经生成系统(Cogngen),这是一种结合了两个神经生物学上可行的计算模型的认知结构:预测性处理和超值/矢量符号模型。我们从ACT-R和Spaun/Nengo等体系结构中汲取灵感。Cogngen与这些相吻合,在ACT-R对人类认知的高级象征描述与Spaun的低水平神经生物学描述之间提供了一定程度的细节,此外,为设计代理人创造了基础,以从多种任务中学习并模拟人类绩效的基础。在比当前系统所能的范围更大的情况下。我们在四个迷宫学习任务上测试Cogngen,包括测试内存和计划的任务,并发现Cogngen与深度强化学习模型的性能相匹配,并且超出了旨在测试内存的任务。
translated by 谷歌翻译
We derive a set of causal deep neural networks whose architectures are a consequence of tensor (multilinear) factor analysis. Forward causal questions are addressed with a neural network architecture composed of causal capsules and a tensor transformer. The former estimate a set of latent variables that represent the causal factors, and the latter governs their interaction. Causal capsules and tensor transformers may be implemented using shallow autoencoders, but for a scalable architecture we employ block algebra and derive a deep neural network composed of a hierarchy of autoencoders. An interleaved kernel hierarchy preprocesses the data resulting in a hierarchy of kernel tensor factor models. Inverse causal questions are addressed with a neural network that implements multilinear projection and estimates the causes of effects. As an alternative to aggressive bottleneck dimension reduction or regularized regression that may camouflage an inherently underdetermined inverse problem, we prescribe modeling different aspects of the mechanism of data formation with piecewise tensor models whose multilinear projections are well-defined and produce multiple candidate solutions. Our forward and inverse neural network architectures are suitable for asynchronous parallel computation.
translated by 谷歌翻译
A "heart attack" or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI), with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MoblieNet architectures. This is followed by the Atrous Spatial Pyramid Pooling (ASPP) to produce spatial information at different scales to preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage where spatial information is recovered via up-sampling to produce final image segmentation output into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.
translated by 谷歌翻译
Diffusion models have achieved justifiable popularity by attaining state-of-the-art performance in generating realistic objects from seemingly arbitrarily complex data distributions, including when conditioning generation on labels. Unfortunately, however, their iterative nature renders them very computationally inefficient during the sampling process. For the multi-class conditional generation problem, we propose a novel, structurally unique framework of diffusion models which are hierarchically branched according to the inherent relationships between classes. In this work, we demonstrate that branched diffusion models offer major improvements in efficiently generating samples from multiple classes. We also showcase several other advantages of branched diffusion models, including ease of extension to novel classes in a continual-learning setting, and a unique interpretability that offers insight into these generative models. Branched diffusion models represent an alternative paradigm to their traditional linear counterparts, and can have large impacts in how we use diffusion models for efficient generation, online learning, and scientific discovery.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
The NASA Astrophysics Data System (ADS) is an essential tool for researchers that allows them to explore the astronomy and astrophysics scientific literature, but it has yet to exploit recent advances in natural language processing. At ADASS 2021, we introduced astroBERT, a machine learning language model tailored to the text used in astronomy papers in ADS. In this work we: - announce the first public release of the astroBERT language model; - show how astroBERT improves over existing public language models on astrophysics specific tasks; - and detail how ADS plans to harness the unique structure of scientific papers, the citation graph and citation context, to further improve astroBERT.
translated by 谷歌翻译
我们提供了证据表明,学到的密度功能理论(``dft')的力场已准备好进行基态催化剂发现。我们的关键发现是,尽管预测的力与地面真相有很大差异,但使用从超过50 \%的评估系统中使用RPBE功能的能量与使用RPBE功能相似或较低能量的力量的力量与使用RPBE功能相似或较低的力量放松。这具有令人惊讶的含义,即学习的潜力可能已经准备好在挑战性的催化系统中替换DFT,例如在Open Catalyst 2020数据集中发现的电位。此外,我们表明,在局部谐波能量表面上具有与目标DFT能量相同的局部谐波能量表面训练的力场也能够在50 \%的情况下找到较低或相似的能量结构。与在真实能量和力量训练的标准模型相比,这种``简易电位''的收敛步骤更少,这进一步加速了计算。它的成功说明了一个关键:即使模型具有高力误差,学到的电位也可以定位能量最小值。结构优化的主要要求仅仅是学到的电位具有正确的最小值。由于学到的电位与系统大小的速度快速且尺寸为线性,因此我们的结果开辟了快速找到大型系统基础状态的可能性。
translated by 谷歌翻译
最近,致力于通过现代机器学习方法预测脑部疾病的最新神经影像学研究通常包括单一模态并依靠监督的过度参数化模型。但是,单一模态仅提供了高度复杂的大脑的有限视图。至关重要的是,临床环境中的有监督模型缺乏用于培训的准确诊断标签。粗标签不会捕获脑疾病表型的长尾谱,这导致模型的普遍性丧失,从而使它们在诊断环境中的有用程度降低。这项工作提出了一个新型的多尺度协调框架,用于从多模式神经影像数据中学习多个表示。我们提出了一般的归纳偏见分类法,以捕获多模式自学融合中的独特和联合信息。分类法构成了一个无解码器模型的家族,具有降低的计算复杂性,并捕获多模式输入的本地和全局表示之间的多尺度关系。我们使用各种阿尔茨海默氏病表型中使用功能和结构磁共振成像(MRI)数据对分类法进行了全面评估,并表明自我监督模型揭示了与疾病相关的大脑区域和多模态链接,而无需在预先访问PRE-PRE-the PRE-the PRE-the PRE-the PRE-PRECTEN NICKES NOCKER NOCKER NOCKER NOCKER NOCKER NOCE访问。训练。拟议的多模式自学学习的学习能够表现出两种模式的分类表现。伴随的丰富而灵活的无监督的深度学习框架捕获了复杂的多模式关系,并提供了符合或超过更狭窄的监督分类分析的预测性能。我们提供了详尽的定量证据,表明该框架如何显着提高我们对复杂脑部疾病中缺失的联系的搜索。
translated by 谷歌翻译
视力转换器被广泛用于各种视觉任务。同时,从MLP-Mixer开始尝试使用基于MLP的体系结构实现类似性能的一系列作品。有趣的是,到目前为止,没有人报告使用它们执行NLP任务,此外,直到现在,这些基于MLP的架构却没有声称可以实现视觉任务最新的架构。在本文中,我们分析了基于MLP的体系结构同时在多个不同输入之间建模依赖性中的表达能力,并显示了注意力与基于MLP的机制之间的指数差距。我们的结果表明,MLP无法与NLP问题中的基于注意力的机制竞争的理论解释,他们还表明,视觉任务的性能差距可能是由于MLP相对弱点在多个不同位置之间的建模依赖性中的相对弱点所致,并且结合在一起。对MLP体系结构的智能输入排列可能不足以缩小性能差距。
translated by 谷歌翻译
高参数调整是改善神经网络性能的常见技术。大多数用于超参数搜索的技术都涉及一个迭代过程,在该过程中,在每次迭代中都会重新训练模型。但是,每次其他搜索迭代的预期准确性提高,仍然未知。计算预期的改进可以帮助创建超参数调整规则,并允许对项目的计算预算进行更明智的分配。在本文中,我们从额外的超参数搜索迭代中提高了预期准确性提高的经验估计。我们的结果适用于基于随机搜索\ cite {bergstra2012random}的任何超参数调整方法,并从固定分布中采样超参数。我们以$ o \ left的错误(\ sqrt {\ frac {\ log k} {k}}} \ right)$ o \ left(\ sqrt {\ frac {\ frac {\ frac {\ right)$ w.h.p.其中$ k $是当前的迭代次数。据我们所知,这是从额外的超参数搜索迭代中获得预期增益的第一键。最后,我们证明了预期准确性的最佳估计值仍将具有$ \ frac {1} {k} $的错误。
translated by 谷歌翻译